IIIT Hyderabad at CLEF 2007 - Adhoc Indian Language CLIR Task

Authors

  • Prasad Pingali
  • Vasudeva Varma
Abstract

This paper presents the experiments of the Language Technologies Research Centre (LTRC) as part of its participation in the CLEF 2007 Indian language to English ad-hoc cross-language document retrieval task. We discuss our Hindi and Telugu to English CLIR system and the experiments using the CLEF 2007 dataset. We used a variant of the TF-IDF algorithm in combination with a bilingual lexicon for query translation. We also explored the role of a document summary in fielded queries and two different Boolean formulations of query translations. We find that a hybrid Boolean formulation, combining Boolean AND and Boolean OR operators, improves the ranking of documents. We also find that a simple disjunctive combination of translated query keywords yields the highest recall.
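The two Boolean formulations mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out how the hybrid query is built, so the version below is an assumption: translation candidates of a single source keyword are joined with OR, and the resulting groups are joined with AND. The lexicon, the function names, and the query string syntax are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy lexicon: transliterated Hindi keyword -> English candidates.
# A real bilingual lexicon would be far larger; these entries are illustrative.
LEXICON: Dict[str, List[str]] = {
    "chunav": ["election", "poll"],
    "parinam": ["result", "outcome"],
}


def translate(keywords: List[str], lexicon: Dict[str, List[str]]) -> List[List[str]]:
    """Return one list of translation candidates per source keyword.

    Keywords missing from the lexicon are skipped (an assumption; the
    abstract does not say how out-of-vocabulary words are handled).
    """
    return [lexicon[k] for k in keywords if k in lexicon]


def disjunctive_query(groups: List[List[str]]) -> str:
    """OR together every translation candidate of every keyword."""
    terms = [term for group in groups for term in group]
    return " OR ".join(terms)


def hybrid_query(groups: List[List[str]]) -> str:
    """Assumed hybrid form: OR within a keyword's candidates, AND across keywords."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


groups = translate(["chunav", "parinam"], LEXICON)
print(disjunctive_query(groups))  # election OR poll OR result OR outcome
print(hybrid_query(groups))       # (election OR poll) AND (result OR outcome)
```

Under this reading, the disjunctive query matches any document containing at least one translated term, which is consistent with the recall finding, while the hybrid query requires every source keyword to be represented by at least one of its translations.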

Similar resources

Improving Recall for Hindi, Telugu, Oromo to English CLIR

This paper presents the Cross Language Information Retrieval (CLIR) experiments of the Language Technologies Research Centre (LTRC, IIIT-Hyderabad) as part of our participation in the ad-hoc track of CLEF 2007. We present approaches to improve recall of query translation by handling morphological and spelling variations in source language keywords. We also present experiments using query expans...

Bengali, Hindi and Telugu to English Ad-hoc Bilingual Task at CLEF 2007

This paper presents the experiments carried out at Jadavpur University as part of our participation in the CLEF 2007 ad-hoc bilingual task. This is our first participation in the CLEF evaluation task, and we have considered Bengali, Hindi and Telugu as query languages for retrieval from an English document collection. We have discussed our Bengali, Hindi and Telugu to English CLIR system as part of...

IIIT Hyderabad’s CLIR experiments for FIRE-2008

This paper describes our CLIR experiments performed for the FIRE workshop. We submitted runs for ad-hoc monolingual document retrieval in Hindi and English, and for ad-hoc cross-lingual document retrieval from Hindi to English and from English to Hindi. In this paper, we describe our English to Hindi and Hindi to English CLIR systems and the experiments conducted on them using the FIRE2008 data...

Cross-Lingual Information Retrieval System for Indian Languages

This paper describes our first participation in the Indian language sub-task of the main ad-hoc monolingual and bilingual track in the CLEF competition. In this track, the task is to retrieve relevant documents from an English corpus in response to a query expressed in one of several Indian languages, including Hindi, Tamil, Telugu, Bengali and Marathi. Groups participating in this track are required to s...

Hindi and Telugu to English CLIR using Query Expansion

This paper presents the experiments of the Language Technologies Research Centre (LTRC) as part of its participation in the CLEF 2007 Indian language to English ad-hoc cross-language document retrieval task. We discuss our Hindi and Telugu to English CLIR system and the experiments using the CLEF 2007 dataset. We used a variant of the TF-IDF algorithm in combination with a bilingual lexicon for...

Publication date: 2007